PREFACE

This technical report is based on the master's thesis of Roger C. Sweet.
Thesis committee members were Herbert J. Klausmeier, Chairman;
Gary A. Davis; and Thomas Ringness.

In our program of research and development at the R & D Center for 
Learning and Re-Education, we have identified sets of variables related to 
five main categories -- stimulus material, instructions, response modes, 
conditions of learning, and organismic. The complete taxonomy is present- 
ed in Technical Report No. 1 of the Center. 

In the present study, Mr. Sweet examined the relationship between the 
teacher's written comments on a test paper (informative feedback) and sub- 
sequent attainment and attitudes toward a school subject, namely, ninth grade 
English. 



Herbert J. Klausmeier 
Co-Director for Research 
Professor of Educational Psychology 




CONTENTS

List of Tables
Abstract
Introduction
Procedure
    Subjects
    Data Gathered
    Method
        General
        Scholastic Achievement
        Attitude Change
    Statistical Treatment
Results
Discussion
    Scholastic Achievement
    Attitude Change
    Concluding Statements
References









LIST OF TABLES 



Table

1  Illustration of Ranked Data
2  Illustration of the Number of Levels in Each Class
3  Procedure Used for Reranking Data
4  Friedman Test of Over-all Treatment Effects
5  Friedman Analysis of Reranked Data (K = 2) Across Individual Subjects
6  Attitude Change Based on the Wilcoxon Matched-Pairs Signed-Ranks Test







ABSTRACT



The study dealt with a partially established relationship between a
teacher's written comments on a test and subsequent student attainment as
measured by test performance. In addition, an attempt was made to ascertain
any attitude change toward a particular school subject (9th grade English)
as a function of the teacher's comment. Three classes of each of three
teachers comprised the sample of 225 students taking ninth grade English.
Over a period of six weeks, each teacher gave four tests which were not of
the long essay type. When each test had been corrected, the teacher returned
the test papers with the numerical score and letter grade as earned. No
Comment (N) students received only the numerical score and letter grade.
Free Comment (F) students received whatever comment the teacher felt it
desirable to make. The Specified Comment (S) students received comments
designated in advance for each letter grade. Attitude inventory scores, based
on Osgood's semantic differential (evaluative dimension), were collected on
the day before the administration of the first test and soon after the return
of the fourth test. Because of the qualitative and quantitative differences
among the tests used by different teachers, the tests were regarded as
ranking instruments.

The Friedman Two-Way ANOVA was used to analyze the ranked data 
across individual subjects and across classes. Comments of either a free or 
specified nature have little if any short-term effect on test performance; over 
a longer period of time, the inclusion of free comments has a significant 
effect on scholastic performance. Attitudes were analyzed by using the 
Wilcoxon Matched-Pairs Signed-Ranks test. A highly negative Z signified 
that only under the free comment condition were attitudes significantly 
changed in a positive direction. This indicated that the inclusion of specified 
comments was no more effective in changing attitudes than were no comments. 
INTRODUCTION



The following investigation dealt with a 
partially established relationship between a 
teacher’s written comments on a test and sub- 
sequent student attainment as measured by 
test performance (Page, 1958a). In addition, 
an attempt was made to ascertain any attitude
change towards a particular school subject 
(9th grade English) as a function of the teach- 
er's comment. 

If teachers' comments were treated as 
social reinforcement within the usual S-R 
framework, a multitude of theoretical ques- 
tions would arise for which there were no 
readily available answers. Treating com- 
ments as a reinforcer leads us to the question 
of what might happen if the comments were 
discontinued. According to the basic princi- 
ples of S-R theory, extinction should follow. 
However, in a high school classroom, it 
would be difficult indeed to decide just exactly 
what it is that is being extinguished. As Page 
(1958a, 1958b) pointed out, investigations of 
praise and blame have provided very fruitful 
knowledge for the general psychologist. For 
the educator, however, these same investiga- 
tions are belabored by many weaknesses. It 
was this author's opinion that most experimen- 
tal attempts to measure the effects of praise 
and blame have been accomplished under spe- 
cially arranged situations where the effects of 
extraneous factors have been minimized and 
the verification of basic S-R principles vir- 
tually assured. This does not deny the great 
value of these basic principles to the area of 
human learning; however, it does suggest that 
the extension to the classroom may not pro- 
vide an adequate test of the theory. 

A teacher's comment may be more ac- 
ceptably treated as a type of feedback. Fur- 
ther, as Bilodeau and Bilodeau (1961) stated, 
"Studies of feedback or knowledge of results 
show it to be the strongest most important 
variable controlling performance and learn- 
ing. " The comments of teachers in the present 
study could be viewed as feedback in that the 
student noted his errors and correct responses



and saw the grade and the evaluative comments 
which were put on his test paper. One type of 
feedback was the error corrections. The stu- 
dent was told the direction and extent of his 
errors with the information supplied in the 
form of error corrections, thus serving the 
directing function of feedback. 

Teachers' comments are also feedback, 
but feedback of a different nature. Like letter 
grades, the comments are a reinforcement 
component providing feedback to the student 
about some of the effects of his behavior. 
They are examples of the teacher's communi- 
cation of approval or disapproval over the 
student's work. A look at the index of the 
Psychological Abstracts may aid in illustrating 
this point. Under the word "Feedback" is the 
statement, "See also Knowledge of Results 
and Reinforcement. " Solomon and Rosenberg 
(1964) broke down feedback in a very similar 
manner. Their article was intended to illus- 
trate how teacher-student feedback could af- 
fect the social structure of the classroom. 
Though their particular problem holds little 
immediate relevance to the present topic, 
their method of analyzing the concept of feed- 
back is of great importance. They wrote of 
an informational as well as a reinforcement 
component. By an informational component, 
they meant indicating correct answers (telling 
the student that an answer is right, if it is; or
telling him that it is wrong and providing the 
correct answer). The reinforcement compo- 
nent referred to the teacher's communication 
of approval or disapproval. This dualistic 
conception of feedback does not differ from the
informational and reinforcement components 
involved in programmed instructional sequences.
Schvaneveldt (1965), in an extremely thorough
review of the informational component of
feedback, stated that a performance-related
signal may be called anything from
reward to knowledge of performance, because 
of the dimensions on which it could vary. 

Though it proves impossible to dismiss 
all the present ambiguity surrounding such 
terms as feedback and reinforcement, the present
writer chose to treat teachers' comments
as being, for the most part, a reinforcement
component of feedback, as opposed to an informational
or knowledge of results component.

A close analysis of Page's (1958a) study 
reveals a very nontheoretical attitude. Page 
used the terms "praise" and "blame," thus 
implicitly considering teacher's comments to 



be reinforcers; however, throughout the paper, 
words such as "reinforcement, " "extinction, " 
and "secondary reinforcers" were conspicu- 
ously absent. It was believed that this very 
empirical and pragmatic philosophy would 
prove to be the most efficient way of handling 
the present problem. 



A survey of the literature reveals a great
deal of latitude with regard to the above philosophy.
The studies range from the very nontheoretical
article of Page's to the rather
narrow S-R formulations of Skinner. However,
a great deal of credit belongs to Skinner
for the reason that he brought the popular misconception
that "learning is its own best reward"
to task (Skinner and Rogers, 1956)
while criticizing the commonly held belief that
the learning process, or knowledge itself, is
attributable to something inside the individual.
Furthermore, Skinner and Holland (1960) felt
that compliance with the above attitudes auto- 
matically put the entire responsibility for 
learning upon the student, giving little regard 
to any possible inadequacies of the training 
program. They were of the opinion that the 
responsibility for learning should be carried 
by the teacher and the teaching situation. In 
his analysis of Skinnerian methodology, Hively 
(1959) also elaborated on who was to be re- 



sponsible for learning in suggesting that the 
reinforcement function in the control of be- 
havior implied a series of operations analo- 
gous to those employed by a skilled private 
tutor. 

In a further elaboration of this idea, 
Staats and Staats (1962, 1963) discussed
"achievement behaviors." They introduced
the theory that, in a naive individual, over- 
coming obstacles and doing something difficult 
was not itself originally reinforcing. For 
some children, working at certain tasks such 
as school work may be heavily reinforced. 
The parents may be the source for much of 
this reinforcement, especially if they are seen
as highly reinforcing by the child. If this is 
the case, any type of feedback paired with 
parental approval (good grades) should also 
take on reinforcing properties. Thus, a child 
who is raised in an environment where objects
and events pertaining to school have been in
contiguous association with positive reinforcers
should find a more abundant supply of reinforcing
stimuli in the school situation.

In an article specifically related to the effectiveness
of verbal reinforcement, McDavid
(1959) concluded that the more effective social
approval is as a reward, the greater the motivational
or incentive value, and consequently,
the greater the probability of high scholastic 
achievement. Furness (1958) accomplished a 
very thorough analysis of all factors, both 
environmental and organismic, involved in 
successful spelling behavior and concluded 
that verbal reinforcement is one of the most 
important. 

The effects of praise and blame as a func- 
tion of intelligence have been investigated by 
Kennedy, Turner, and Lindner (1962). Rele- 
vant to the present investigation was the fact 
that they studied the effects of praise and 
blame without using formal S-R terminology. 
The one unfortunate aspect, however, re- 
volved around their learning task which was a 
simple visual discrimination problem, far 
removed from the normal course work of high 
school students. 

There has been a vast amount of research 
revolving around the effects of feedback, yet 
very few investigators have studied the effects 
of written comments on test papers. The 
most exhaustive study of this variable was 
accomplished by Page (1958a). Page used 74
randomly selected secondary teachers, who
were teaching a total of 2,139 students. The
teachers administered to their respective stu- 
dents whatever objective tests would occur in 
the usual course of instruction. After scoring 
and grading the test papers in their customary 
way, and matching the students by perfor- 
mance, they randomly assigned the papers to 
one of three treatment groups. The No Com- 
ment group received no marks beyond those 
for grading. The Free Comment group re- 
ceived whatever comments the teachers felt 
were appropriate for the particular students 
and test. The Specified Comment group re- 
ceived certain uniform comments, designated 
beforehand by Page for all similar letter 
grades, which were felt to be generally "en- 
couraging. " The teachers returned the tests 
to the students without any unusual attention. 
The scores on the next objective test became 
the criterion of comment effect. Page found 
that students who had received either a free 
or specified comment on the first test did sig- 
nificantly better on the second test than did 
those students who received nothing but a
numerical score and letter grade on the first
test. These results held across different
schools, ages, and grade-point averages of the
students involved. The opinion held by many
teachers that the better students would be more
responsive to the comments was not verified.

Page's study demonstrates two very im- 
portant points. First, it illustrates that meth- 
odologically "good" research can be done in a 
normal classroom setting. Thus, we find pre- 
sent one of those rarities of psychological re- 
search, a well controlled and well designed 
study whose data was of immediate use some- 
where besides the laboratory. Second, Page 
provided proof that a combination of both informational
(grades) and reinforcing (personalized
comments) feedback is superior to informational
feedback alone in positively affecting
scholastic performance. Related to the
idea of comment inclusion is the hypothesis 
that a student who sees a comment, positive 
or negative, on his work will feel that the 
teacher must really be interested in him. This
inference was in agreement with the results of 
Prosad and Singh (1962), who illustrated that 
undergraduate students feel that the better 
teachers are the ones who show an individual 
interest in them. 

In addition to analyzing the effects which
written comment inclusion may have on scholastic
performance, the present study intended
to ascertain what effects, if any, these comments
may have in changing a student's attitude.
The popular conception that scholastic
achievement and attitudes toward school are 
closely related has received a great deal of 
empirical support: Quay (1959), Bostrum,
Vlandis, and Rosenbaum (1961), Brodie (1964),
Aiken and Dreger (1961), and Weaver (1959).
However, Wright and Jung (1959) presented 
1,011 excellent reasons for not considering 
this relationship automatic. They investigated 
the reasons that 1,011 students who finished 
in the top 10% of their high school class did 
not desire to continue their education. Among 
the most often stated reasons was a specific 
dislike for school and associated factors. 

Normally, attitudes and behavior have 
been considered as two separate entities, 
with one seen as causing a change in the other. 
The majority of studies in this area have dealt 
with the causal effects of behavior change upon 
the changing of an attitude. The social psy- 
chological research of Festinger centers 
around the theory of cognitive dissonance as 
an explanatory concept. His model, as well 
as other cognitive consistency models of atti- 
tude change ("Congruity, " "Balance"), are 



reviewed by Cohen (1964). 

For this study, the present writer was unable
to see the necessity for treating behavior
and attitude in a causal fashion. Instead, it
was decided to treat these entities as explicit
(behavior) and implicit (attitude) responses,
which were subject to change not as a function
of each other, but rather as a function of feedback
in the form of teachers' comments.

The following hypotheses were suggested: 

A. Returned test papers bearing Free and 
Specified written comments of teachers, 
along with letter or numerical grades and 
error corrections, are associated with 
higher student attainment in ninth grade 
English over a short time period, more
so than are tests which are returned with 
No Comments, merely containing a letter 
or numerical grade and error corrections. 

B. Over a longer period of time only Free 
written comments by teachers, along with 
letter or numerical grades and error cor- 
rections on returned test paper, are 
associated with higher student attain- 
ment. 

C. Attitudes toward ninth grade English are 
positively influenced by the inclusion of 
Free Comments or Specified Comments 
but not by No Comments. 

Indirectly, Hypothesis A is opposed to the 
findings of Lintz and Brackbill (1966) whose 
comparisons of money rewards and flashing 
lights have shown little effect on performance. 
Schvaneveldt (1965) also felt that experiments 
with human adults have demonstrated null ef- 
fects regarding the manipulation of reward 
independent of information. While it is agreed 
that the inclusion of written comments with- 
out error corrections or letter grades would 
not be sufficient to improve school performance,
it is also felt that a combination of
written comments with error corrections and 
letter grades is superior in improving school 
attainment to a test paper being returned 
which contains only error corrections and a 
letter grade. 

Hypothesis B represents an elaboration 
on the findings of Page (1958a) who found no
significant differences in the effects of Free 
and Specified Comments on school perfor- 
mance. His study only covered the amount of 
time it took to administer two tests. It is 
suggested that as more tests are given over 
a greater period of time, the student will become
"immune" to specified or stock
comments.
PROCEDURE



SUBJECTS 



The sample was drawn from students 
taking ninth grade English at a high school in 
a large Midwest city. From all teachers in- 
structing ninth grade English at the school, 
three were chosen at random. From each of 
the three teachers, three classes were ran- 
domly chosen. The final N consisted of 225 
students. 

DATA GATHERED 

During a period of six weeks, each teach- 
er gave four tests which were not of the long 
essay type. (See Sweet, 1966, for a more detailed
description of the instructions given to
the teachers.) When each set of tests had
been corrected, the teacher returned the test 
papers with the numerical score and letter 
grade as earned. Each test was returned before
another one was given. The experimental
treatments were as follows (Page, 1958a):
No Comment (N) students received no com- 
ments, just the numerical score and letter 
grade. Free Comment (F) students received 
whatever comment the teacher felt it desirable
to make. The Specified Comment (S)
students received comments designated in 
advance for each letter grade as follows: 

A - Excellent! Keep it up. 

B - Good work. Keep it up. 

C - Perhaps try to do still better. 

D - Let's bring this up. 

F - Let's raise this grade! 

Attitude inventory scores were collected 
on two occasions (See Sweet, 1966). The 
students filled out the inventory the day be- 
fore the administration of the first test and 
soon after the return of the fourth test. It 
took about twenty minutes on each occasion 



to administer the inventory. The inventory 
developed for this study was based on Osgood's 
semantic differential (Osgood, Suci, and 
Tannenbaum, 1957) and used the three 
scales which were shown to be most factorially
pure with regard to the "evaluative"
dimension.

The decision to use the differential in 
this study was based on Osgood's opinion that 
an attitude is some portion of the internal 
mediational activity and thus a part of the se- 
mantic structure of the individual. This opin- 
ion originated from the idea of an attitude as 
being a learned implicit response which is potentially
bipolar and which mediates all evaluative
behavior. It was decided to use the differential
in this study because of the above
opinions, and because an apparent "evaluative
dimension" had been isolated.

The following is a sample from the dif- 
ferential, and the "responses" to a particular 
item: 

Charles Dickens

Good        1   2   3   4  (5)  6   7    Bad
Valuable    1   2   3  (4)  5   6   7    Worthless
Positive    1   2  (3)  4   5   6   7    Negative

(The marked response on each scale is shown in parentheses.)

For the item "Charles Dickens, " the student 
would have a total score of twelve. For each 
item, the possible score ranged from three to 
twenty-one. 
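
The scoring rule just described can be sketched in a few lines of Python. This is only an illustration: the function name `item_score` is hypothetical, and the responses 5, 4, and 3 are an assumption that is consistent with the stated total of twelve for "Charles Dickens."

```python
# Minimal sketch of scoring one semantic differential item.
# Assumption: each item has three 7-point evaluative scales and the
# item score is the plain sum of the three marked responses.

def item_score(responses):
    """Return the sum of three scale responses, each between 1 and 7."""
    if len(responses) != 3:
        raise ValueError("expected exactly three scale responses")
    for r in responses:
        if not 1 <= r <= 7:
            raise ValueError("each response must lie between 1 and 7")
    return sum(responses)

# Assumed responses for the "Charles Dickens" item (Good-Bad,
# Valuable-Worthless, Positive-Negative).
print(item_score([5, 4, 3]))
# The extremes give the possible range of an item score.
print(item_score([1, 1, 1]), item_score([7, 7, 7]))
```

With three scales per item, the smallest and largest possible sums reproduce the range of three to twenty-one given in the text.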

METHOD 

General 

The students were assigned to one of the 
three treatments in the following manner.
The students of each teacher within each 
class were ranked according to their first 
semester grade in English. In this respect, 
the present study differed from that of Page, 
who ranked and assigned to a treatment group
on the basis of the scores on the first objective
test. Each of the top three students of each
teacher was randomly assigned to one of the 
three treatment groups. This procedure was 
then repeated with the next best three, and so 
on, until all students had been assigned to one 
of the treatment groups. In all, there were 
225 students, making a total of 75 subjects 
within each treatment. 
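
The matched assignment described above can be sketched as follows. This is a hedged illustration rather than the procedure actually used in the study: the function name `assign_treatments`, the `(name, prior_grade)` data layout, and the use of Python's `random` module are all assumptions.

```python
import random

TREATMENTS = ("N", "F", "S")  # No Comment, Free Comment, Specified Comment

def assign_treatments(students, rng=None):
    """Assign treatments within levels of three students.

    students: list of (name, prior_grade) pairs for one class.
    Students are ranked by prior grade (highest first); each successive
    group of three forms a level, and the three treatments are shuffled
    and handed out within that level.
    """
    rng = rng or random.Random()
    ranked = sorted(students, key=lambda s: s[1], reverse=True)
    assignment = {}
    for start in range(0, len(ranked), 3):
        level = ranked[start:start + 3]
        order = list(TREATMENTS)
        rng.shuffle(order)
        for (name, _grade), treatment in zip(level, order):
            assignment[name] = treatment
    return assignment

# Invented illustration data: nine students with made-up prior grades.
demo = [("s%d" % i, 100 - i) for i in range(9)]
print(assign_treatments(demo, random.Random(0)))
```

Because every level contributes exactly one student to each treatment, the three treatment groups end up matched on prior English grade.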

Scholastic Achievement 



In all, there were nine classes, consisting
of 75 levels and making up a total of 225 students.
This is illustrated in Table 2.

Table 2

Illustration of the Number of Levels
in Each Class

Class     No. of Levels per Class
  1           7     (21 Students)
  2           8     (24 Students)
  3           9     (27 Students)
  4           9     (27 Students)
  5           9     (27 Students)
  6           9     (27 Students)
Total        75     225

The effects of comments were judged by
the scores achieved on the second and fourth
test, regardless of the nature of that test.
This was done to investigate the "short term"
and "long term" effects of the comments and
thus verify or refute Hypothesis B. Certain
statistical problems were present, since each
test differed from the others with regard to
practically every conceivable test variable,
such as subject matter, length, and difficulty.
However, when the tests were regarded primarily
as ranking instruments, most of the
difficulties disappeared. For example, a
class of twenty-seven students formed nine
levels on the basis of the first semester
grades in English. Each level consisted of
three students, with each student receiving a
different treatment: No Comment, Free Comment,
or Specified Comment. Students then
were given raw scores on each of the four
tests within a six-week period. On the basis
of such scores, they were assigned rankings
within levels as illustrated in Table 1, Part B.

Table 1

Illustration of Ranked Data

                  Part A                         Part B
          (Raw Scores on Test 2)     (Ranks within Levels on Test 2)
                Treatment                      Treatment
Level        N      S      F               N      S      F
  1         33     31     34               2      1      3
  2         30     25     32               2      1      3
  3         29     33     23               2      3      1
  .          .      .      .               .      .      .
  9         14     25     21               1      3      2
Sum                                       19     21     20
Sum Ranks                                  1      3      2

Note: Taken from Page (1958a), p. 312.



The Friedman Two-Way Analysis of
Variance was used to analyze the treatment
effects when students were considered as
matched independently from one common population.
In this case, the summations of the
rankings of 75 levels were analyzed. In addition,
the Friedman was used to analyze the
treatment effects when treatment groups within
classes were regarded as intact groups.
For this particular analysis, the sums of the
ranks in each class were ranked. Referring
back to the bottom of Table 1, the sums
19, 21, and 20 are ranked. Their rankings
would be 1, 3, and 2, respectively. This procedure
was carried out nine times, once for
each of the nine classes.
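
The two ranking steps described above (ranking raw scores within each level, then ranking the column sums within a class) can be sketched as below. The helper name `rank_within_level` is hypothetical, and giving tied scores the mean of their ranks is an assumption; the source does not say how ties were handled, although the half-unit rank sums in Table 4 are consistent with it.

```python
def rank_within_level(scores):
    """Rank scores from 1 (lowest) upward; tied scores share their mean rank."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the run while the next score equals the current one.
        while end + 1 < len(order) and scores[order[end + 1]] == scores[order[pos]]:
            end += 1
        mean_rank = (pos + end) / 2 + 1
        for k in range(pos, end + 1):
            ranks[order[k]] = mean_rank
        pos = end + 1
    return ranks

# Rows shown in Table 1, in (N, S, F) order; the levels between 3 and 9
# are not printed in the source, so they are not reproduced here.
for raw in [(33, 31, 34), (30, 25, 32), (29, 33, 23), (14, 25, 21)]:
    print(raw, rank_within_level(raw))

# Class-group step: rank the column sums from the bottom of Table 1.
print(rank_within_level((19, 21, 20)))
```

The same function serves both steps because the class-group analysis simply treats the three rank sums of a class as one more row to be ranked.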

In order to analyze the effects of only 
two treatments at a time, one treatment was 
dropped out, and the other two were reranked. 
In this situation, the number of treatments (K) 
was equal to two, which introduced the prob- 
lem of the feasibility of using the Friedman in 
such an analysis. Friedman (1937) stated that 
in this special case (K = 2) the method of 
ranks was equivalent to the binomial series 
test, which is equivalent to the sign test when 
N>25. Class-group data could not be computed
in this fashion because the distribution
no longer approaches normality when
N < 25 (here N = 9).
Table 3

Procedure Used for Reranking Data

  Part A        Part B            Part C       Part D       Part E
 Raw Score   Ranking (N,S,F)    Reranking    Reranking    Reranking
 N   S   F     N   S   F          F   S        F   N        S   N
33  31  34     2   1   3          2   1        2   1        1   2

Note: For a more complete illustration, see Appendix to Sweet (1966).
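
The reranking shown in Table 3 amounts to dropping one treatment and ranking the remaining two raw scores. A minimal sketch follows; the function name `rerank_pair` and the rank of 1.5 for tied scores are assumptions not stated in the source.

```python
def rerank_pair(raw, keep):
    """Rerank two treatments after the third has been dropped.

    raw:  dict mapping treatment label to raw score for one level.
    keep: the two treatment labels being compared.
    The lower score receives rank 1 and the higher score rank 2;
    tied scores are both given 1.5 (an assumption).
    """
    a, b = keep
    if raw[a] == raw[b]:
        return {a: 1.5, b: 1.5}
    low, high = (a, b) if raw[a] < raw[b] else (b, a)
    return {low: 1, high: 2}

# The single level shown in Table 3.
raw = {"N": 33, "S": 31, "F": 34}
for pair in (("F", "S"), ("F", "N"), ("S", "N")):
    print(pair, rerank_pair(raw, pair))
```

Summing these pairwise ranks over all 75 levels gives the two rank sums that enter the K = 2 analysis.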



Attitude Change 

Attitudes toward ninth grade English
were measured using the three most factorially
pure scales of Osgood's evaluative dimension.
Scoring was arranged so that
smaller scores indicated more positive attitudes.
The Wilcoxon Matched-Pairs Signed-Ranks
test was used in order to discover any
possible attitude change. It is more
powerful than the simple sign test in that it
utilizes information about magnitude, as
well as direction, of change. It gives more
weight to a pair which shows a large difference
between two conditions than to a
pair which shows a small difference.



STATISTICAL TREATMENT 



A nonparametric test, the Friedman 
two-way ANOVA, was used to compare the 
effects of teacher's comments on test per- 
formance. This test is useful when the 
measurement of the variable is in at least an 
ordinal scale. It was previously stated (under
Method) that the Friedman was used in analyzing
the overall treatment effects both across all 
students, and between the nine classes. This 
was accomplished in the following manner: 



(a) Across individual subjects. In this
    case, the summations of the rankings
    for seventy-five levels were
    analyzed. The three grand sums
    (one for each treatment) were then
    used to compute the value of χr²
    using formula 2a (in Siegel, 1956):

    \[ \chi_r^2 = \frac{12}{NK(K+1)} \left[ (R_1)^2 + (R_2)^2 + (R_3)^2 \right] - 3N(K+1), \]

    where K = number of treatments (3)
          N = number of levels or rows (75)
          R = sum of the ranks for each column

(b) Across classes. Here, the
    Friedman (formula 2a) was used to
    compare the treatment effects when
    treatment groups within classes were
    regarded as intact groups. In this
    situation, N = 9 rather than 75.



Formula 2a was also applicable in analyzing
only two treatments at a time. Here,
K = 2, not 3. This analysis could not be
undertaken for comparing between-class effects
for reasons already stated (N = 9 is too small).
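
As a check on formula 2a, the statistic can be recomputed from the rank sums reported in Tables 4 and 5. The sketch below is an illustration only; the function name `friedman_chi_r2` is hypothetical, and the bracketed sum is written for a general K so that the same code covers both K = 3 and K = 2.

```python
def friedman_chi_r2(rank_sums, n_levels):
    """Friedman statistic from column rank sums (formula 2a).

    rank_sums: one rank sum R per treatment (so K = len(rank_sums)).
    n_levels:  N, the number of levels or rows.
    """
    k = len(rank_sums)
    sum_of_squares = sum(r * r for r in rank_sums)
    return 12.0 / (n_levels * k * (k + 1)) * sum_of_squares - 3 * n_levels * (k + 1)

# K = 3, individual subjects (N = 75), rank sums taken from Table 4.
print(round(friedman_chi_r2([140.5, 158, 151.5], 75), 3))   # Test 2
print(round(friedman_chi_r2([140.5, 146, 163.5], 75), 3))   # Test 4

# K = 3, class groups (N = 9), Test 4 rank sums from Table 4.
print(round(friedman_chi_r2([13.5, 18, 22.5], 9), 2))

# K = 2, F versus N on Test 4 (N = 75), rank sums from Table 5.
print(round(friedman_chi_r2([103, 122], 75), 2))
```

The recomputed values agree with the tabled statistics to within rounding in the last decimal place.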

The Wilcoxon test necessitated the com- 
putation of T, the statistic on which the 
Wilcoxon is based. In order to compute T, 
let d equal the difference between the score 
on the first and second administration of the 
inventory. All of the d's were then ranked 
without regard to sign. Then, the sign of the 
difference was affixed to each rank indicat- 
ing which ranks arose from negative d’s 
(positive change in attitude), and which ranks 
arose from positive d's (negative change in 
attitude). The number of ranks having a + d 
and a - d were tabulated. The Wilcoxon T is 
the summation of those ranks having the least 
frequent sign. T was the summation of posi- 
tive ranks under all three treatments indica- 
ting that most students regardless of treat- 
ment, experienced a positive change in their 
attitudes toward ninth grade English. 
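
The computation of T can be sketched as below. Several points are assumptions: the function name `wilcoxon_t` is hypothetical, d is taken as the second score minus the first (so that a negative d means a move toward a more positive attitude), tied absolute differences share their mean rank, and T is taken as the smaller of the two signed-rank sums, which is Siegel's definition and normally coincides with the sum for the less frequent sign. The scores are invented for illustration.

```python
def wilcoxon_t(first, second):
    """Return (T, n) for paired scores.

    Pairs with no change are dropped, the remaining differences are ranked
    by absolute size (ties share the mean rank), and T is the smaller of
    the positive-rank sum and the negative-rank sum. n is the number of
    pairs that were kept.
    """
    diffs = [b - a for a, b in zip(first, second) if b != a]
    order = sorted(range(len(diffs)), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * len(diffs)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and abs(diffs[order[end + 1]]) == abs(diffs[order[pos]]):
            end += 1
        for k in range(pos, end + 1):
            ranks[order[k]] = (pos + end) / 2 + 1
        pos = end + 1
    positive = sum(r for d, r in zip(diffs, ranks) if d > 0)
    negative = sum(r for d, r in zip(diffs, ranks) if d < 0)
    return min(positive, negative), len(diffs)

# Invented inventory scores for five students (not data from the study).
before = [10, 12, 9, 15, 8]
after = [8, 12, 10, 11, 5]
print(wilcoxon_t(before, after))
```

In this made-up example one pair shows no change and is dropped, which mirrors the reason N falls below 75 for two of the treatments in Table 6.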

Once the Wilcoxon T was computed, it
was introduced into formula 2b (Siegel, 1956),
which is as follows:

\[ z = \frac{T - \mu_T}{\sigma_T} \]

with \( \mu_T = \text{Mean} = \frac{N(N+1)}{4} \)

and \( \sigma_T = \text{SD} = \sqrt{\frac{N(N+1)(2N+1)}{24}} \)

and N = Number of levels.

With N > 25, the sum of the ranks, T, is
approximately normally distributed, allowing
for the computation of a Z score (formula 2b).
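
Formula 2b can be checked against the entries of Table 6. The sketch below assumes the standard large-sample form of the Wilcoxon statistic given in Siegel (mean N(N+1)/4 and standard deviation equal to the square root of N(N+1)(2N+1)/24); the function name `wilcoxon_z` is hypothetical.

```python
import math

def wilcoxon_z(t, n):
    """Return (z, mean, sd) for a Wilcoxon T based on n non-zero pairs."""
    mean = n * (n + 1) / 4
    sd = math.sqrt(n * (n + 1) * (2 * n + 1) / 24)
    return (t - mean) / sd, mean, sd

# T and N for each treatment, taken from Table 6.
for label, t, n in (("F", 950.5, 74), ("S", 1238.5, 75), ("N", 1155.5, 73)):
    z, mean, sd = wilcoxon_z(t, n)
    print(label, round(mean, 1), round(sd, 2), round(z, 2))
```

The recomputed means, standard deviations, and Z scores agree with Table 6 to within rounding.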
III

RESULTS 



Table 4 illustrates the overall comparison
of all three treatment effects across
individual subjects and classes for Tests 2
and 4. From Test 2 to Test 4, the performance
of students under treatment N remained
stable, while deteriorating under
treatment S and improving under treatment
F. Test 2, or short term data, indicated
little if any treatment effect with regard to
individual subjects. Treated as independent
class groups, however, some treatment effect
was noticed, with the probability of getting a
χr² of 4.832 or larger no greater than 10%.

The probability (P < .15) associated
with the observed χr² of 3.845 for the Test
4 individual-subjects data indicated that
there may be a definite trend with regard to
the effects of written comments. Once again,
there was a moderately low probability (P <
.10) associated with the Test 4 class-groups
data.

Table 5 lists the results when only two
treatments at a time were compared. The
Test 2 data (short term) seemed to present
little evidence favoring either Specified or
Free Comments over No Comments. However,
trends in this direction were present,
as witnessed by the relatively small probabilities
associated with the observed χr²
values between F and N and between S and N
in comparison with the large probability
(P < .60) associated with the observed χr²
value between treatments F and S.

The Test 4 data (long term) indicated
that the majority of any treatment effect was
related to the Free Comment condition.



Table 4

Friedman Test of Overall Treatment Effects

                                    N       S       F      df     χr²      P
Test 2   Individual Subjects      140.5   158     151.5    2     2.086   <.35
         Class-group Subjects      17.5    20.5    17      2     4.832   <.10
Test 4   Individual Subjects      140.5   146     163.5    2     3.845   <.15
         Class-group Subjects      13.5    18      22.5    2     4.50    <.10

Note: Modeled after Friedman in Siegel (1956), pp. 166-173.

Table 5

Friedman Analysis of Reranked Data (K = 2)
Across Individual Subjects

                               N      S      F     df     χr²      P
Test 2   Between F and S             115    110     1      .33   <.60
         Between F and N      107           118     1     1.61   <.20
         Between S and N      106    119            1     2.25   <.15
Test 4   Between F and S             107    118     1     1.61   <.20
         Between F and N      103           122     1     4.81   <.03
         Between S and N      110    115            1      .33   <.60
This was evident in all three Test 4 comparisons.
In the comparison between treatments
F and S, the probabilities associated with the
observed χr² values dropped from 60% (Test 2) to
20% (Test 4), while the probabilities associated
with the χr² values between S and N
rose from 15% (Test 2) to 60% (Test 4). The
comparison between treatment F and treatment
N indicated that there was only a 3%
chance of being wrong in considering these
two ranked summations as being from different
populations.



Table 6 seems to illustrate clearly that only under
the Free Comment treatment were attitudes significantly
changed in a positive direction. This effect was
significant at the 5% level. The Specified Comment and
No Comment conditions were almost equally nonsignificant,
indicating that the inclusion of Specified Comments was
no more effective in changing attitudes than were
No Comments.



Table 6

Attitude Change Based on the Wilcoxon
Matched-Pairs Signed-Ranks Test

Treatment   Wilcoxon T     Mean       Sd       N       Z       P

F              950.5      1387.5    185.62     74    -2.35   <.05
S             1238.5      1425      189.37     75     -.98   <.35
N             1155.5      1350.5    181.89     73    -1.07   <.30

Note: Modeled after Wilcoxon in Siegel (1956), pp. 75-83.



Whenever an individual received the same score on both administrations of the 
inventory, the score was dropped from the analysis, thus explaining why N 
does not equal 75 in the F and N treatments. 
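
The Mean, Sd, and Z columns of Table 6 follow from the Wilcoxon T and N columns. The sketch below is illustrative only. It assumes the usual large-sample normal approximation for the signed-ranks statistic, with mean N(N+1)/4 and standard deviation √(N(N+1)(2N+1)/24), and it assumes that the tabled probabilities are two-tailed. The function name is hypothetical.

```python
import math


def wilcoxon_normal_approx(t, n):
    """Normal approximation for the Wilcoxon signed-ranks T.

    t: the Wilcoxon T statistic (smaller sum of like-signed ranks).
    n: number of pairs remaining after zero differences are dropped.
    Returns (mean, sd, z, two-tailed probability).
    """
    mean = n * (n + 1) / 4.0
    sd = math.sqrt(n * (n + 1) * (2 * n + 1) / 24.0)
    z = (t - mean) / sd
    p_two_tailed = math.erfc(abs(z) / math.sqrt(2.0))  # assumption: two-tailed
    return mean, sd, z, p_two_tailed


# T and N values copied from Table 6.
table6 = {"F": (950.5, 74), "S": (1238.5, 75), "N": (1155.5, 73)}
results = {name: wilcoxon_normal_approx(t, n) for name, (t, n) in table6.items()}

for name, (mean, sd, z, p) in results.items():
    print(name, round(mean, 1), round(sd, 2), round(z, 2), round(p, 3))
```

Under these assumptions only the Free Comment treatment yields a probability below .05, in line with the tabled values.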
IV

DISCUSSION 



SCHOLASTIC ACHIEVEMENT 

Before considering the theoretical reasons associated
with the results listed in Tables 4 and 5, there are
certain statistical considerations which must be
discussed. In analyzing overall treatment effects, Page
(1958a) treated his ranked data in three ways. Two of
these methods, which have already been discussed
(Table 1), were also used in the present investigation.
First, all the ranks were summed, making it possible to
analyze the overall treatment effects with regard to
individual subjects. Second, the sums of the ranks within
each class were ranked. This allowed for the analysis of
overall treatment effects with regard to class-group
subjects. The Friedman Two-Way ANOVA is a nonparametric
test lacking the power inherent in the parametric F test.

Page also analyzed his overall treatment 
effects in a third fashion. He took the sum- 
mation of ranks in each column within each 
of his seventy-four classes, and then divided 
this sum by the number of levels in each 
class, with the result being a mean rank 
within treatment within class. In explaining his
reasons for doing this, Page commented, "This score
proved very useful since it fulfilled certain
requirements for parametric data."

Keeping in mind the statistical fact of life that a
not very sensitive nonparametric test was used, the
overall Test 2 results listed in Table 4 give limited
support to the Page study and the predictions stated in
Hypothesis A. Depending upon how large an α one is
willing to accept, the results listed in the Test 2 data
of Table 5 could either support or dispute the Page
findings and the short term prediction of treatment
effects given in Hypothesis A. For instance, the short
term results in Tables 4 and 5 could be interpreted in
two ways. First, it could be argued that some promising
trends were evident, that both the F and S treatments
seemed to have the same short-term positive effect upon
test performance. A second argument would be that by the
end of the second test, any effects which comment
inclusion, either free or specified, had were negligible.
The first argument would lend support to the Page data
and to the verification of Hypothesis A. However, the
data in Table 4 cannot be ignored. While a possible trend
may be present, the overall treatment effects with regard
to individual subjects were very discouraging, thus
causing the writer to question the Page results and to
hold the opinion that neither treatment had any
appreciable short term effect on test performance.

A study of the long term (Test 4) data did 
indicate a definite treatment effect. The Table 
4, Test 4 results provided the first hint of long 
term effects, across both individual subjects 
and classes. Table 5 indicated more specifi- 
cally that any treatment effect was due mainly 
to the Free Comment condition. This would 
tend to support the predictions of Hypothesis 
B, but was definitely not in agreement with 
Page, who in his analysis after two tests 
found no significant differences between Free 
and Specified Comments. 

It was hypothesized that Specified Comments would lose
their effect by the time the fourth test was taken, the
reason being that the students would rapidly become
"immune" to the appearance of "stock" comments.
Psychophysical correlates of this notion come from
"stimulus satiation" studies. Wolfle (1951) states that
if training is of a prolonged and monotonous nature,
variety in the stimulus materials may speed up learning
rather than retard it. Seashore (1944) attempted to
combat monotony in the training of radio operators by
using a great variety of drill materials. Men trained
under these conditions learned more rapidly than did men
having less varied drills. Even though the stimulus
materials varied in these two studies are not
specifically related to feedback, the results are readily
applicable to an explanation of the lack of long term
treatment effects with Specified Comments. It could be
postulated that students become bored after seeing the
same comment (stimulus) time and time again, and that
something similar to the "satiation effects" found in
lab studies could be occurring.

One further explanation concerning the lack of
improvement under the specified condition revolves
around the impersonal nature of the Specified
Comments. It is possible
that allowing the teacher to write what she 
desired on the student's paper caused her to 
make her comments far more personal, since 
she was probably familiar with the past prog- 
ress and general makeup of each of her 
students. 

No definite answer can yet be given for 
the effects, or the lack of them, which Spec- 
ified Comments have, though the problem is 
interesting and easily open to investigation. 
At this point, all that can be said with cer- 
tainty is that comments of either a free or 
specified nature have little if any short term 
effect on test performance, but that over a 
longer period of time the inclusion of Free 
Comments has a significant effect on scho- 
lastic performance. 

ATTITUDE CHANGE 

The most striking results came from the effects which
comments had on attitudes toward English. It was
hypothesized that comments of either type would tend to
change attitudes in a positive direction. The evidence
presented in Table 6 did not entirely coincide with the
predictions of Hypothesis C. As with the test performance
data, it seemed that only Free Comment inclusion had a
positive effect on attitude change. These results
provided fairly good evidence supporting the opinion that
attitudes are simply responses which are governed by many
of the same laws related to other behaviors.

Of extreme interest was the fact that 
Specified Comments did not have any signifi- 
cant effects on attitude change. This set of 
data could serve to supplement the results 
concerning the authenticity surrounding the 
effects of Specified Comments on test per- 
formance. It illustrated that under certain 
conditions, there is a very strong R-R relationship
between attitudes and test performance in ninth grade
English. It could be hypothesized that we learn
hierarchies of
verbal and nonverbal responses, and that 
this is the reason that attitude tests (obser- 
vations of verbal behavior) may be used to 
predict test performance in English. As a 
final note, the possibility that Specified Com- 
ments may have little positive effect on either 
test performance or attitudes should serve as 
a warning to teachers who follow the rather 
uncreative and depersonalized practice of 
using "stock" comments, thinking that they 
are enough. Instead, the teacher should, 
when time permits, make truly personal com- 
ments, comments from which a student can 
get the feeling that the teacher really is con- 
scious of his efforts, or the lack of them. 

CONCLUDING STATEMENTS 

A word of caution is advised, however, 
before overemphasizing the positive results 
related to the effects of Free Comments. 
Adding to the fact that the results were not 
overly significant is the more important con- 
sideration of defining "long term" effects. 
The Page study ran for about two weeks, the 
present study six weeks. This is quite a few 
weeks short of a semester, or school year, 
and any further investigations of the comment 
variable should extend for a longer period. 
Future studies of this nature should not have 
to hide behind an operational definition of 
"long term." Once it has been decided to
execute such a longitudinal study, there is 
one other important factor which must be in- 
vestigated. Page made a detailed analysis of 
factors such as letter grade, school year, and 
school and found no significant effects due to 
these factors. However, he failed to control 
for the sex of both the students and teachers. 
The present study was blocked controlling for 
sex, but the analysis of its effects was not 
within the temporal or statistical scope of 
this paper. However, study of the ranked 
data indicated that boys were more affected 
by the comments than were girls. These 
data seem to go against both common sense 
and research evidence. The fact that all 
three teachers were young females may have 
had something to do with this. In the future, 
a longitudinal study controlling for sex will 
be necessary if a more accurate appraisal of 
comment effects is desired. 